Local Knowledge-Base Q&A with LangChain + ChatGLM2-6B

Original project on GitHub: https://github.com/imClumsyPanda/langchain-ChatGLM

Project Deployment

· Version: v0.2.6

Machine configuration:

· Python environment: Anaconda3 + Python 3.10.12

· GPU: 2 × RTX 3090 + CUDA 11.7

· torch: 2.0.1 (CUDA not upgraded to 12)

· conda env: py310_dtglm

Model Downloads

· m3e: https://huggingface.co/moka-ai/m3e-base/tree/main

· chatglm2-6b: https://huggingface.co/THUDM/chatglm2-6b/tree/main

ChatGLM Tsinghua Cloud mirror: https://cloud.tsinghua.edu.cn/d/674208019e314311ab5c/?p=%2F&mode=list

(All models here are downloaded to /root/huggingface.)

Create the virtual environment and install dependencies

conda create -n py310_dtglm python=3.10.12
conda activate py310_dtglm

pip install --use-pep517 -r requirements.txt -i https://mirror.baidu.com/pypi/simple
pip install --use-pep517 -r requirements_api.txt -i https://mirror.baidu.com/pypi/simple
pip install --use-pep517 -r requirements_webui.txt -i https://mirror.baidu.com/pypi/simple
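After the three requirements files are installed, a quick sanity check can confirm the key packages actually resolved from the mirror. This is my own sketch, not part of the project; the package list is a sample, not exhaustive:

```python
import importlib.util

def missing_packages(names: list[str]) -> list[str]:
    """Return the names that cannot be resolved in the current environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# A sample of packages the requirements files should provide:
print(missing_packages(["torch", "fastapi", "streamlit", "langchain"]))
```

An empty list means all sampled packages import cleanly; anything printed needs to be reinstalled before continuing.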

Adjust the configuration and model paths

Copy the example config files

python copy_config_example.py

Edit the config files

· model_config.py

MODEL_ROOT_PATH = "/root/huggingface"

MODEL_PATH = {
    "embed_model": {
        ...
        "m3e-base": "/root/huggingface/m3e-base",  # point this at the downloaded m3e model
        ...
    },
    # TODO: add all supported llm models
    "llm_model": {
        ...
        "chatglm2-6b": "/root/huggingface/chatglm2-6b",  # point this at the downloaded chatglm2-6b model
        ...
    },
}

EMBEDDING_MODEL = "m3e-base"  # newer SOTA embedding models such as bge-large-zh-v1.5 are also worth trying
LLM_MODEL = "chatglm2-6b"
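A wrong path here only surfaces as a load failure deep inside startup. A small pre-flight check of my own (not project code) that every configured model directory exists can save a restart cycle:

```python
import os

def missing_models(model_path: dict[str, dict[str, str]]) -> list[str]:
    """Return model names whose configured directory does not exist on disk."""
    return [
        name
        for group in model_path.values()
        for name, path in group.items()
        if not os.path.isdir(path)
    ]

# Excerpt mirroring the two entries configured above:
MODEL_PATH = {
    "embed_model": {"m3e-base": "/root/huggingface/m3e-base"},
    "llm_model": {"chatglm2-6b": "/root/huggingface/chatglm2-6b"},
}
print(missing_models(MODEL_PATH))  # empty when both directories exist
```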

· server_config.py

# webui.py server
WEBUI_SERVER = {
    "host": DEFAULT_BIND_HOST,
    "port": 8501,
}

# api.py server
API_SERVER = {
    "host": DEFAULT_BIND_HOST,
    "port": 7861,
}

# fastchat openai_api server
FSCHAT_OPENAI_API = {
    "host": DEFAULT_BIND_HOST,
    "port": 20000,
}

FSCHAT_MODEL_WORKERS = {
    "default": {
        "host": DEFAULT_BIND_HOST,
        "port": 20002,
        "device": LLM_DEVICE,
        "infer_turbo": False,

        # parameters required for multi-GPU loading in model_worker
        "gpus": "0,1",  # GPUs to use, given as a string such as "0,1"; if this has no effect, set CUDA_VISIBLE_DEVICES="0,1" instead
        "num_gpus": 2,  # number of GPUs to use
        "max_gpu_memory": "20GiB",  # maximum VRAM to occupy per GPU
    },

    "zhipu-api": {  # give each online API you run a distinct port
        "port": 21001,
    },
}

# fastchat controller server
FSCHAT_CONTROLLER = {
    "host": DEFAULT_BIND_HOST,
    "port": 20001,
    "dispatch_method": "shortest_queue",
}
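With five services each claiming a fixed port, startup failures are often just port collisions. A small helper of my own (not project code) to confirm the configured ports are free before launching:

```python
import socket

def port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Try to bind the port; success means nothing is listening on it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
            return True
        except OSError:
            return False

for name, port in [("webui", 8501), ("api", 7861), ("openai_api", 20000),
                   ("controller", 20001), ("model_worker", 20002)]:
    print(name, port, "free" if port_free(port) else "IN USE")
```

Any port reported IN USE needs either the conflicting process stopped or a different port configured above.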

Initialize the default knowledge base

Sample knowledge-base file location: knowledge_base/samples/content/test.txt

python init_database.py --recreate-vs
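Rebuilding the vector store means splitting each document into overlapping chunks before embedding them. The project's real splitter is more elaborate; this is only my sketch of the sliding-window idea, using chunk_size=250 and chunk_overlap=50 as defaults:

```python
def split_text(text: str, chunk_size: int = 250, chunk_overlap: int = 50) -> list[str]:
    """Slide a window of chunk_size over the text, stepping by size - overlap."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - chunk_overlap, 1), step)]

chunks = split_text("0123456789" * 60)  # a 600-character document
print(len(chunks))  # -> 3 chunks of up to 250 characters each
```

The overlap keeps sentences that straddle a chunk boundary retrievable from both neighbors.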

Start the project

python startup.py -a

Adding a knowledge base through the FastAPI interface

http://host:7861/knowledge_base/upload_docs

curl -X 'POST' \
'http://host:7861/knowledge_base/upload_docs' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'to_vector_store=true' \
-F 'override=false' \
-F 'not_refresh_vs_cache=false' \
-F 'chunk_size=250' \
-F 'chunk_overlap=50' \
-F 'zh_title_enhance=true' \
-F 'files=@分体式M录AI智能分析设备建设方案.docx;type=application/vnd.openxmlformats-officedocument.wordprocessingml.document' \
-F 'knowledge_base_name=琅琊' \
-F 'docs='
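The same upload can be scripted from Python. This is my own stdlib-only sketch (the project does not ship this helper): it hand-builds a multipart/form-data body with the same fields as the curl call above.

```python
import json
import mimetypes
import urllib.request
import uuid

def build_multipart(fields: dict, files: dict) -> tuple[bytes, str]:
    """Encode plain fields and (filename, bytes) files as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    for name, (filename, data) in files.items():
        ctype = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: {ctype}\r\n\r\n".encode() + data + b"\r\n"
        )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"

def upload_docs(base_url: str, kb_name: str, file_path: str) -> dict:
    """POST one document to the knowledge-base upload endpoint."""
    with open(file_path, "rb") as f:
        data = f.read()
    body, content_type = build_multipart(
        {
            "knowledge_base_name": kb_name,
            "to_vector_store": "true",
            "override": "false",
            "not_refresh_vs_cache": "false",
            "chunk_size": "250",
            "chunk_overlap": "50",
            "zh_title_enhance": "true",
        },
        {"files": (file_path.rsplit("/", 1)[-1], data)},
    )
    req = urllib.request.Request(
        f"{base_url}/knowledge_base/upload_docs",
        data=body,
        headers={"accept": "application/json", "Content-Type": content_type},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Usage would look like `upload_docs("http://host:7861", "琅琊", "方案.docx")`; the docx MIME type from the curl call is filled in automatically by `mimetypes`.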

Select the knowledge base for Q&A

Code Adjustments

Integrating the Baichuan LLM

Edit ./configs/model_config.py

"llm_model": {
    ...
    "baichuan-13b-chat": "/home/Baichuan2-main/baichuan-inc/Baichuan2-13B-Chat",
    ...
}

# LLM name
LLM_MODEL = "baichuan-13b-chat"

Streaming API output

Install sse_starlette

pip install sse-starlette -i https://mirror.baidu.com/pypi/simple

In ./chat/*.py, update the endpoints:

from sse_starlette.sse import EventSourceResponse
...
return EventSourceResponse(chat_iterator(query=query,
                                         history=history,
                                         model_name=model_name,
                                         prompt_name=prompt_name))


Copyright © 2017 - 2024 青域 All Rights Reserved.
